Extending Decision Tree Classifiers for Uncertain Data

Authors

  • M. Suresh
  • Krishna Reddy
  • R. Jayasree
Abstract

Traditionally, decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include measurement/quantization errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by one single value, but by multiple values forming a probability distribution. Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we discover that the accuracy of a decision tree classifier can be much improved if the “complete information” of a data item (taking into account the probability density function (pdf)) is utilized [1]. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments show that the resulting classifiers are more accurate than those using value averages. Processing pdfs is, however, computationally more costly than processing single values (e.g., averages).
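The contrast between using the full distribution and using an average can be illustrated with a small sketch. It is not the authors' algorithm: it assumes each uncertain value is given as a discrete list of (value, probability) sample points, that a tuple is divided fractionally between the two sides of a split point, and that splits are scored by weighted entropy. The function names `entropy`, `split_entropy_distribution`, and `split_entropy_average` are hypothetical.

```python
import math
from collections import defaultdict


def entropy(weights):
    """Shannon entropy (bits) of a mapping class label -> (fractional) count."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)


def split_entropy_distribution(tuples, z):
    """Weighted entropy of the split 'value <= z' using each tuple's full pdf.

    tuples: list of (samples, label), where samples is a list of
    (value, probability) pairs summing to 1 (an assumed discrete pdf).
    A tuple contributes its probability mass at or below z to the left
    branch and the remaining mass to the right branch.
    """
    left, right = defaultdict(float), defaultdict(float)
    for samples, label in tuples:
        p_left = sum(p for v, p in samples if v <= z)
        left[label] += p_left
        right[label] += 1.0 - p_left
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)


def split_entropy_average(tuples, z):
    """Same score after collapsing every pdf to its mean (the averaging approach)."""
    collapsed = [([(sum(v * p for v, p in samples), 1.0)], label)
                 for samples, label in tuples]
    return split_entropy_distribution(collapsed, z)


# Two toy tuples: the first is uncertain (1.0 or 3.0, equally likely),
# the second is a precise value.
data = [
    ([(1.0, 0.5), (3.0, 0.5)], "A"),
    ([(1.0, 1.0)], "B"),
]
print(split_entropy_distribution(data, 2.0))  # about 0.689 bits
print(split_entropy_average(data, 2.0))       # 1.0 bit
```

In this toy case the mean of the first tuple (2.0) falls on the same side of the split as the second tuple, so the averaged data cannot be separated at all, while the distribution-based score still credits the half of the probability mass that lies above the split point.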


Similar resources

Classification: A Decision Tree For Uncertain Data Using CDF

Decision trees are suitable and widely used for describing classification phenomena. This paper presents a decision tree based classification system for uncertain data. Uncertain data is data that lacks certainty. Data uncertainty arises from various factors, including sensor error, network latency, measurement precision limitations, and multiple repeated measurements. We find that decision tr...


Classification of Categorical Uncertain Data Using Decision Tree

Certain data is data whose values are known precisely, whereas uncertain data is data whose values are not known precisely. In real-life applications, however, data is always uncertain. Under data uncertainty, an attribute value is represented by a set of values. There are two types of attributes in data sets, namely numerical and categorical attributes. Data uncertainty can arise in both numerical and catego...


DTU: A Decision Tree for Uncertain Data

The decision tree is a widely used data classification technique. This paper proposes a decision tree based classification method for uncertain data. Data uncertainty is common in emerging applications, such as sensor networks, moving object databases, medical and biological bases. Data uncertainty can be caused by various factors including measurement precision limitations, outdated sources, sensor...


Research on Dynamic Cost-sensitive Decision Tree for Mining Uncertain Data Based on the Genetic Algorithm

The existing classifiers for uncertain data do not consider dynamic cost, so this paper proposes a dynamic cost-sensitive decision tree classification approach for uncertain data based on the genetic algorithm (GDCDTU), which overcomes the limitations of stationary cost and automatically searches for a suitable cost space for every sub-dataset. Firstly, this paper gives the dyna...


Performance Analysis on Uncertain Data using Decision Tree

Data uncertainty is common in emerging applications, such as sensor networks, moving object databases, and medical and biological fields. Data uncertainty can be caused by various factors, including measurement precision limitations. It is inherent in various applications for reasons such as outdated sources, imprecise measurement, and transmission problems. Classificati...




Publication date: 2012